Dynamic access control in a content-based publish/subscribe system with delivery guarantees

ABSTRACT

Improved access control techniques for distributed messaging systems such as content-based publish/subscribe systems are disclosed. For example, a method for providing access control in a content-based publish/subscribe system, wherein messages are delivered from publishing clients to subscribing clients via a plurality of brokers, includes the following steps/operations. One or more changes to an access control policy are specified. An access control version identifier is associated to the one or more changes. The one or more changes are sent to one or more brokers of the plurality of brokers that have a publishing client or a subscribing client associated therewith that is affected by the one or more changes. The access control version identifier associated with the one or more changes is sent to each of the plurality of brokers.

FIELD OF THE INVENTION

The present application generally relates to distributed messaging systems and, more particularly, to access control techniques in such systems.

BACKGROUND OF THE INVENTION

A popular approach employed in distributed messaging systems for use with asynchronous distributed applications is the content-based publish/subscribe (pub/sub) messaging approach. A content-based pub/sub system includes publishing client devices or machines (“publishers”) that generate messages and subscribing client devices or machines (“subscribers”) that register interest in messages matching the predicate/Boolean filter specified in their subscription. The system ensures timely delivery of published messages to all interested subscribers, and typically contains routing broker devices or machines (“brokers”) for this purpose. Thus, in the content-based pub/sub paradigm, the information providers (i.e., publishers) and consumers (i.e., subscribers) are decoupled, since publishers need not be aware of which subscribers receive their messages, and subscribers need not be aware of the sources of the messages they receive.

For many applications, content-based pub/sub systems are required to provide strong service guarantees (such as reliable, in-order, gapless delivery), high scalability to support a large number of clients, high service availability and high performance/throughput. In order to achieve these goals, typical systems: (1) propagate and consolidate subscription information toward publishers; (2) using the subscription information, perform content filtering to achieve good network bandwidth utilization and scalability; and (3) utilize redundant network paths for high service availability.

An important but often ignored issue that can hinder the commercial adoption of pub/sub systems is the security assurance provided to applications. In particular, applications want to ensure the confidentiality, integrity and authenticity of events as they are disseminated. It is often required that only trusted sources should be allowed to publish events, and that information/events should only be distributed to authorized or paying subscribers. A closely related problem is accounting and auditing, which enables the billing of subscribers based on usage.

This issue of access control in pub/sub systems is further complicated by the issue of changing access control policies. One problem associated with a change in the access control policy is the disruption in pub/sub service that occurs when the change is being made.

Accordingly, there is a need for improved access control techniques for distributed messaging systems which are able to efficiently and effectively account for access control policy changes.

SUMMARY OF THE INVENTION

Principles of the invention provide improved access control techniques for distributed messaging systems such as content-based publish/subscribe systems.

For example, in one aspect of the invention, a method for providing access control in a content-based publish/subscribe system, wherein messages are delivered from publishing clients to subscribing clients via a plurality of brokers, includes the following steps/operations. One or more changes to an access control policy are specified. An access control version identifier is associated to the one or more changes. The one or more changes are sent to one or more brokers of the plurality of brokers that have a publishing client or a subscribing client associated therewith that is affected by the one or more changes. The access control version identifier associated with the one or more changes is sent to each of the plurality of brokers. In one embodiment, the access control version identifier is a number.

Each of the one or more changes to the existing access control policy may be stored and implemented in the system as a batch, having the access control version identifier associated therewith, so as to uniquely identify the one or more changes from one or more previous changes to the existing access control policy of the system.

Each of the plurality of brokers may be at least one of a publisher hosting broker (PHB), a subscriber hosting broker (SHB) and an intermediate broker (IB), and the above specifying, associating and sending steps are performed in accordance with a security administrator.

The security administrator may send the one or more changes and the associated access control version identifier to PHBs and SHBs that have a publishing client or a subscribing client associated therewith that is affected by the one or more changes.

An SHB, upon receipt of the one or more changes, may compute a restricted subscription for an affected client. The SHB may then send the restricted subscription along with the access control number to one or more other brokers.

A PHB may, upon receipt of the one or more changes, apply the one or more changes to the access control policy to obtain the latest publishing rights and the access control version identifier. Further, a PHB may, upon receipt of a data message to be published, apply the latest publishing rights to the message. The PHB may then send the data message along with the access control number to one or more other brokers.

An IB may maintain a control version vector.

In another aspect of the invention, a content-based publish/subscribe system for providing message delivery from a publishing client to a subscribing client includes a plurality of brokers operatively coupled to one another via a network, each of the brokers being configured as at least one of a publisher hosting broker (PHB), a subscriber hosting broker (SHB) and an intermediate broker (IB). The system also includes at least one administrator being operatively coupled to at least a portion of the plurality of brokers, and being configured to store and update at least one access control policy within the system. At least a portion of the plurality of brokers and the at least one administrator are configured to implement a change to the access control policy within the system by including an access control version identifier with one or more messages sent therebetween, wherein the access control identifier uniquely identifies the access control policy that is in effect, such that the change in the access control policy deterministically and uniformly applies to publishing clients and subscribing clients associated with one or more principals affected by the change in the access control policy.

Advantageously, the system may thereby guarantee deterministic and uniform access control semantics to all subscribers on behalf of the same principals in the system, even when a crash and restart of at least one broker loses the non-persistent state of the at least one broker, and even when at least one link in the network fails and its connection is re-established, causing at least one message transmitted over the at least one link to be dropped, duplicated, or delivered out of order. That is, different subscribers on behalf of the same principal will receive exactly the same sequence of messages (modulo subscription filter differences), even when they are connected at different sub-networks in the system and even when their sub-networks may experience different communication latency and network or routing broker failures.

The plurality of brokers may be configured to eliminate a need for persistent storage of access control state at brokers other than the PHBs. A PHB may be configured to persistently store the control version identifier associated with the latest access control policy. A PHB may be configured to persistently store access control version identifiers associated with access control policies that were in effect at the time a message was published. Multiple paths may exist between a PHB and SHB, and IBs on different paths need not maintain identical access control state. At least a portion of the IBs may maintain access control version vectors, with one version per SHB, rather than maintaining access control rules. Each SHB may maintain the latest access control rules for principals that are connected thereto.

An SHB may subscribe to access control rule changes for principals connected thereto. IBs may filter access control rule changes by principal. Reliable delivery may be used to ensure that access control rule changes are received by the SHBs that need them. An SHB that accepts a connection from a new principal may use a request-response protocol to initialize the access control rules for that principal.

An SHB may intersect subscriptions with the latest access control rules and assign and maintain the control versions of the intersected subscriptions using the version of the control rules. The SHB may propagate the resulting subscription with the access control version identifier to upstream IBs. IBs may maintain subscription state with access control version identifiers.

PHBs may include the access control version identifier in data messages. IBs may use subscription state to filter the message if the access control version identifier in the data message is no more than the access control version number in the subscription state, and otherwise send the message downstream. An SHB may check equality of the control version identifier of the intersected subscriptions that match a message with the control version of the message to enforce subscribing access control rules.
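By way of illustration only, the version comparisons described above may be sketched as follows. The names SubscriptionState, ib_should_forward and shb_should_deliver are hypothetical and are introduced solely for this sketch. The handling of a version mismatch at the SHB is not specified in this passage, so the sketch simply reports that the message is not delivered at that time.

```python
from dataclasses import dataclass
from typing import Callable, Dict

Message = Dict[str, object]


@dataclass
class SubscriptionState:
    """Hypothetical per-subscription state held by a broker."""
    version: int                        # access control version of the intersected subscription
    matches: Callable[[Message], bool]  # the intersected (restricted) subscription filter


def ib_should_forward(msg_version: int, msg: Message, state: SubscriptionState) -> bool:
    """Decision at an intermediate broker (IB) for one downstream subscription."""
    if msg_version <= state.version:
        # The subscription state is at least as recent as the message,
        # so the IB may filter using that state.
        return state.matches(msg)
    # The subscription state is older than the message: send it downstream unfiltered.
    return True


def shb_should_deliver(msg_version: int, msg: Message, state: SubscriptionState) -> bool:
    """Decision at a subscriber hosting broker (SHB): the versions must be equal."""
    return state.version == msg_version and state.matches(msg)
```

In this sketch an IB that lags behind the message version never drops the message, which leaves final enforcement to the SHB.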

These and other objects, features and advantages of the present invention will become apparent from the following detailed description of illustrative embodiments thereof, which is to be read in connection with the accompanying drawings.

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1A is a block diagram illustrating at least a portion of a content-based publish/subscribe system including an exemplary network of brokers, according to one embodiment of the invention.

FIG. 1B is a block diagram illustrating a computing architecture for one or more components of a content-based publish/subscribe system, according to one embodiment of the invention.

FIG. 2 is an information flow diagram illustrating an implementation of a broker network, according to one embodiment of the invention.

FIG. 3 is a block diagram illustrating an exemplary implementation of a service model, according to one embodiment of the invention.

FIG. 4 is a flow diagram illustrating a summary of an access control policy distribution and message delivery protocol, according to one embodiment of the invention.

FIGS. 5A and 5B are block diagrams illustrating a portion of a content-based publish/subscribe system implementing an access control policy distribution and message delivery protocol, according to one embodiment of the invention.

FIGS. 6A through 6E are flow diagrams illustrating an access control policy distribution and message delivery protocol, according to one embodiment of the invention.

DETAILED DESCRIPTION OF PREFERRED EMBODIMENTS

Illustrative embodiments of the present invention will be described below in conjunction with an illustrative content-based publish/subscribe system including a plurality of broker devices or brokers which are preferably connected together to form an overlay network, although alternative connection arrangements are contemplated by the invention. The plurality of brokers are responsible for delivery of one or more messages sent by publishers to subscribers based, at least in part, on the content of these messages and/or on filtering predicates requested by the subscriber.

The brokers may be grouped according to certain functions. For example, one or more of the brokers may preferably be specialized for hosting publishers. These brokers may be referred to herein as publisher hosting brokers or PHBs. Furthermore, one or more of the brokers may preferably be specialized for hosting subscribers. These brokers may be referred to herein as subscriber hosting brokers or SHBs. Between the PHBs and SHBs, there may exist any number of intermediate hops that include routing and/or filtering. The brokers at such hops may be referred to herein as intermediate brokers or IBs. For ease of explanation, it will be assumed that each of the different brokers is a separate entity. In an actual implementation, however, it is contemplated that any one broker may be capable of performing the functions of one or more PHBs, SHBs and IBs.

Before describing illustrative embodiments of an access control service model and an access control policy distribution and message delivery protocol of the invention, an illustrative content-based pub/sub system in which such model and protocol may be implemented will be described in the context of FIGS. 1A, 1B, and 2. Such content-based pub/sub system is described in detail in the U.S. patent application identified as Ser. No. 10/177,474, filed on Jun. 21, 2002, and entitled “Gapless Delivery and Durable Subscriptions in a Content-Based Publish/Subscribe System,” the disclosure of which is incorporated by reference herein. It is to be understood, however, that the access control techniques of the invention are not limited to this illustrative system.

FIG. 1A illustrates at least a portion of a content-based pub/sub system including an exemplary network of brokers, formed in accordance with one aspect of the invention. Publishers 101 a, 101 b, 101 c and 101 d preferably establish connections to particular PHBs, 102 a and 102 b, over corresponding client connections 107 a, 107 b, 107 c and 107 d, respectively. The client connections may generally be any type of communication medium for conveying transmitted information, including a wireless communication link, such as, for example, infrared, radio frequency, satellite, microwave, etc., and a dedicated communication connection, such as, for example, telephone, cable, fiber optic, etc. Preferably, each of the client connections is a reliable, first-in-first-out (FIFO) connection, such as, but not limited to, a Transmission Control Protocol/Internet Protocol (TCP/IP) socket connection.

Independently, subscribers 105 a, 105 b, 105 c and 105 d preferably establish connections to SHBs, 104 a and 104 b, over corresponding client connections 108 a, 108 b, 108 c and 108 d, respectively. Client connections 108 a, 108 b, 108 c and 108 d are preferably consistent with client connections 107 a, 107 b, 107 c and 107 d previously described. The PHBs, 102 a, 102 b, and SHBs, 104 a, 104 b, may be connected to IBs, 103 a and 103 b, via broker-to-broker connections 106 a, 106 b, 106 c, 106 d, 106 e, 106 f, 106 g and 106 h. Assuming the network employs a gapless delivery protocol and connection failures and message reordering are tolerated, the broker-to-broker connections need not use reliable FIFO protocols such as TCP/IP, and may instead advantageously use faster, less reliable protocols, thereby increasing system throughput.

As shown in FIG. 1B, each publisher, subscriber and broker (denoted as 150) may be implemented in accordance with a processor 152, memory 154 and one or more input/output (I/O) devices 156. It is to be appreciated that the term “processor” as used herein is intended to include any processing device, such as, for example, one that includes a central processing unit (CPU) and/or other processing circuitry (e.g., microprocessor). Additionally, it is to be understood that the term “processor” may refer to more than one processing device, and that various elements associated with a processing device may be shared by other processing devices. The term “memory” as used herein is intended to include memory and other computer-readable media associated with a processor or CPU, such as, for example, random access memory (RAM), read only memory (ROM), fixed storage media (e.g., a hard drive), removable storage media (e.g., a diskette), flash memory, etc. Furthermore, the term “input/output devices” or “I/O devices” as used herein is intended to include, for example, one or more input devices (e.g., keyboard, mouse, network interface card, etc.) for entering data to the processor, and/or one or more output devices (e.g., printer, monitor, network interface card, etc.) for presenting the results associated with the processor.

Accordingly, an application program, or software components thereof, including instructions or code for performing the methodologies of the invention, as will be further described herein, may be stored in one or more of the associated storage media (e.g., ROM, fixed or removable storage) and, when ready to be utilized, loaded in whole or in part (e.g., into RAM) and executed by the processor 152. Thus, each publisher, subscriber, and broker may be, for example, either a standalone computer, a process or application running on a computer, or, to minimize delay due to system failures, a cluster of redundant processes running in a distributed manner within multiple computers.

With reference now to FIG. 2, there is shown an exemplary information flow diagram illustrating an implementation of the broker network, in accordance with one aspect of the present invention. As apparent from the figure, the illustrative information flow diagram for the broker network comprises a plurality of nodes (depicted as ovals), referred to herein as information streams, and edges or paths (depicted as arrows between a source oval and a destination oval), referred to herein as transforms. The information flow diagram may be constructed by a system administrator, either statically or in response to subscription requests. The information flow diagram defines paths between source information streams 211 a and 211 b, referred to herein as pubends, and destination information streams 216 a, 216 b, 216 c, 216 d, 216 e, 216 f, 216 g, 216 h, 216 i, 216 j, 216 k, 216 l (collectively, 216), referred to herein as subends, via intermediate information streams 212 a, 212 b, 212 c, 212 d, 212 e, 212 f, 212 g, 212 h, 212 i, 212 j, 212 k, 212 l, 212 m, 212 n, 212 o, 212 p, 212 q, 212 r, 212 s, 212 t, 212 u, 212 v, 212 w, 212 x, 212 y, 212 z, 212 aa, 212 bb, 212 cc, 212 dd (collectively, 212).

Preferably, each publisher delivers messages to exactly one pubend, while each subscriber receives messages from one or more subends within a single SHB. Each transform is either a filter transform 214 a, 214 b, 214 c, 214 d, 214 e, 214 f, 214 g, 214 h, 214 i, 214 j, 214 k, 214 l, 214 m, 214 n, 214 o, 214 p, 214 q, 214 r, 214 s, 214 t, 214 u, 214 v, 214 w, 214 x, 214 y, 214 z (collectively, 214), a link transform 213 a, 213 b, 213 c, 213 d, 213 e, 213 f, 213 g, 213 h, 213 i, 213 j, 213 k, 213 l, 213 m, 213 n, 213 o, 213 p, 213 q, 213 r, 213 s (collectively, 213), or a merge transform 215 a, 215 b. Information can be delayed, lost, or reordered while passing through a given transform, although in practice this will typically only occur over links.

Filters preferably include a predicate denoting a content filter. For example, filter 214 e specifies that only messages having content matching “Loc=NY” will pass. A filter having no predicate associated therewith (e.g., 214 i and 214 j) passes all content, and is essentially equivalent to a link.

Each broker 202 a, 202 b, 203 a, 203 b, 204 a, 204 b preferably has a timer or clock 222 a, 222 b, 222 c, 222 d, 222 e, 222 f, respectively, associated therewith. Although the methodologies of the present invention do not require that these clocks be synchronized to real time, performance may be improved if these clocks are at least approximately accurate, or synchronized with respect to one another. In addition to having a clock associated with a particular broker, PHBs 202 a and 202 b require a stable storage medium 221 a and 221 b, respectively, associated therewith. Stable storage is intended to include nonvolatile memory, such as, for example, nonvolatile RAM, fixed storage, removable storage, etc. The remaining brokers (e.g., SHBs and IBs) 203 a, 203 b, 204 a, 204 b do not require stable storage, but may instead use “soft” state. The ability to require stable storage only in PHBs, and to allow SHBs and IBs to utilize soft state, advantageously distinguishes the present broker network from other protocols which employ store-and-forward techniques. These conventional protocols generally require stable storage associated with each broker in the network.

The path(s) from pubends to a given client's subend(s) determine which messages that client is guaranteed to receive. Specifically, each path propagates messages satisfying a conjunction (i.e., logical AND) of the predicates corresponding to each filter along the path. If there are multiple paths associated with a given node, that node receives messages that satisfy a disjunction (i.e., logical OR) of the path filters.
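By way of example only, the conjunction along a path and the disjunction across paths may be sketched as follows. The function names path_accepts and node_accepts, the representation of a message as a dictionary, and the sample filter values are assumptions made for this sketch and do not appear in the figures.

```python
from typing import Callable, Dict, List

Message = Dict[str, object]
Predicate = Callable[[Message], bool]


def path_accepts(path: List[Predicate], msg: Message) -> bool:
    """A path passes a message only if every filter on the path passes it (logical AND)."""
    return all(predicate(msg) for predicate in path)


def node_accepts(paths: List[List[Predicate]], msg: Message) -> bool:
    """A node reached by several paths receives a message if any path passes it (logical OR)."""
    return any(path_accepts(path, msg) for path in paths)


# Illustrative filters only: Topic=1 & Loc=NY & p>3 on a single path.
EXAMPLE_PATH: List[Predicate] = [
    lambda m: m.get("Topic") == 1,
    lambda m: m.get("Loc") == "NY",
    lambda m: m.get("p", 0) > 3,
]
```

A filter with no predicate can be modeled in this sketch as an empty contribution to the path, since all() over an empty list is true and the path then behaves like a link.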

By way of example only, consider a client C1, which is associated with subends 216 a and 216 b. Based on the paths, namely, 214 o, 213 i, 214 e, 213 a and 214 a, between subend 216 a and a pubend, C1 will receive messages published to pubend 211 a that satisfy the filters “Topic=1” & “Loc=NY” & “p>3,” where the symbol “&” represents a logical AND operation. Likewise, based on the paths, namely, 214 q, 213 m, 214 g, 213 f and 214 d, between subend 216 b and a pubend, C1 will receive messages published to pubend 211 b that satisfy the filters “Topic=1” & “Loc=NY” & “p>3.”

Each subend is preferably an ordered stream. Therefore, client C1 will receive all relevant messages from pubend 211 a (i.e., those messages having content which satisfies the filters associated with the given paths) in the order they were published, and all relevant messages from pubend 211 b in the order they were published. However, between a message published to pubend 211 a and another message published to pubend 211 b there is no necessary order. This implies that, irrespective of publish times, it is generally unpredictable whether a given message from pubend 211 a will arrive before or after a given message from pubend 211 b. This is an example of a client subscription with content selection (e.g., Topic=1 & Loc=NY & p>3) and publisher order.

In contrast, consider client C2, which is associated with a single subend 216 e. As shown in FIG. 2, the paths, namely, 214 s, 213 k, 214 i, 215 a, 213 b, 214 b, 213 e and 214 c between subend 216 e and a pubend are the merge of filter “Topic=2” from pubend 211 a and filter “Topic=2” from pubend 211 b, further filtered by “i=1.” Since client C2 has a single subend associated therewith, it receives a single ordered stream. This is an example of a client subscription with content selection (e.g., Topic=2 & i=1) and total order. Notice that client C3 has a subscription with the same content selection (e.g., Topic=2 & i=1) and total order, and will therefore receive the same messages in the same order as client C2. This uniform total order property of the present broker network is a consequence of the fact that the merge transform is deterministic, meaning that two merge transforms receiving identical input information streams will produce the same merged output information stream.
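The determinism property may be illustrated with a minimal sketch. The text does not state the ordering rule used by the merge transform, so the rule below (order by tick, with ties broken by a pubend identifier) is purely an assumption chosen to make the merge deterministic; the function name merge_streams and the tuple layout are likewise hypothetical.

```python
import heapq
from typing import List, Tuple

# Hypothetical entry layout: (tick, pubend identifier, payload).
Entry = Tuple[int, str, str]


def merge_streams(*streams: List[Entry]) -> List[Entry]:
    """Merge already-ordered streams into one totally ordered stream.

    The key (tick, pubend identifier) is an assumed tie-breaking rule. Because
    the key depends only on the entries themselves, any two merges given the
    same input streams produce the same output, in any argument order.
    """
    return list(heapq.merge(*streams, key=lambda entry: (entry[0], entry[1])))
```

Under this assumption, two subends fed by separate merge transforms over the same pubends would observe the same total order, which is the property relied upon for clients C2 and C3.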

The illustrative information flow diagram of FIG. 2 directs the gapless delivery methodologies of the present pub/sub system. In summary, each information stream preferably keeps track of what has occurred during each particular interval of time or tick. Thus, each information stream preferably comprises a data message (or a silence) and a curiosity representing how eager it is to learn about that tick. Knowledge flows downstream (i.e., in the direction of the arrows), while curiosity flows upstream (i.e., in a direction counter to the direction of the arrows). Accordingly, subends deliver messages when they detect that a gapless sequence of knowledge ticks has been extended. Pubends, on the other hand, log messages in stable storage. These logs maintained in stable storage may be subsequently utilized as arbiters of curiosity if no other broker has knowledge about what happened during a given tick.
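A minimal sketch of the subend behavior summarized above is given below, assuming integer ticks starting at zero. The class name Subend, the SILENCE marker and the method names are hypothetical; the actual protocol is described in the application incorporated by reference and is not reproduced here.

```python
from typing import Dict, List

SILENCE = object()  # hypothetical marker meaning "no message was published in this tick"


class Subend:
    """Delivers messages only when the gapless prefix of known ticks is extended."""

    def __init__(self) -> None:
        self.known: Dict[int, object] = {}  # knowledge for ticks not yet consumed
        self.next_tick = 0                  # first tick not yet covered by the gapless prefix
        self.delivered: List[object] = []

    def learn(self, tick: int, value: object) -> None:
        """Record knowledge flowing downstream; duplicates and stale ticks are ignored."""
        if tick < self.next_tick or tick in self.known:
            return
        self.known[tick] = value
        # Extend the gapless prefix as far as current knowledge allows.
        while self.next_tick in self.known:
            item = self.known.pop(self.next_tick)
            if item is not SILENCE:
                self.delivered.append(item)
            self.next_tick += 1

    def curious_ticks(self, horizon: int) -> List[int]:
        """Ticks below the horizon that are still unknown; curiosity for these flows upstream."""
        return [t for t in range(self.next_tick, horizon) if t not in self.known]
```

Because out-of-order and duplicate knowledge is absorbed by the subend in this sketch, the links that carry it need not be reliable or FIFO.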

Given the above description of an illustrative content-based pub/sub system, access control principles of the invention will now be illustratively described.

Principles of the invention realize that a publish/subscribe system should be able to continue functioning without having to shut down to enact new access control policies, as is often required by mission-critical applications such as electronic trading in financial markets.

Accordingly, illustrative embodiments of the invention provide dynamic access control in a pub/sub system with content-based filtering and routing, reliable delivery and redundant routes. To this end, a deterministic service model is provided for dynamic access control in a content-based pub/sub system. The deterministic guarantee of such a service model enables precise control over event confidentiality and is independent of issues like client locations, network latency and failures. The administrator has knowledge of the exact point in time in a published message stream when an access control change takes place. The semantics of reliable delivery under this model is clearly defined, that is, two or more subscribers of the same principal will receive the same set of messages when the access control policy of the principal is changed, no matter where the subscribers are connected.

It is to be understood that a “principal,” as used herein, generally refers to an entity. Such entity can be positively identified and verified through an authentication technique. A principal is granted certain rights to use the resource or service of the system. Actual clients of the system can act on behalf of a principal and use the resource/service of the system. When the system is servicing a client, the system does so according to the rights granted to the principal on behalf of which the client connects. Examples of types of principals will be explained below.

Illustrative embodiments of the invention also provide an algorithm supporting this deterministic service model. Using this algorithm, access control changes are performed uniformly across all brokers to which the affected principals connect. There is no need for the system to obtain consensus from these brokers, which could compromise the efficiency of the system and timeliness of enacting the change. The algorithm is: (1) efficient in that it allows access control enforcement to be distributed across the network and performed close to event sources; and (2) highly available by allowing routing to choose any redundant paths without requiring consensus among these paths.

A. Service Model

In this section, a deterministic service model of dynamic access control is illustratively described. More particularly, we describe the various entities involved in dynamic access control and their roles, a content-based form for specifying access control rules and the clear starting points of access control changes.

In the illustrative service model, there are two types of entities that are involved in access control:

(1) Security administrator. The security administrator is the ultimate authority of access control in the system. The security administrator decides (based on external factors such as client service contracts) the access rights for client principals (defined below) and/or whether there should be any change to their existing access rights. The security administrator instructs the system of his/her decisions through an administrative interface.

In a large system, there may be multiple security administrators. As the changes made by each administrator may affect overlapping sets of clients, the system accepts the changes in a transactional and serializable manner. For the purpose of this illustrative description and simplicity of discussion, we consider the security administrators as a single entity that initiates a single sequence of policy changes. We assume that policy changes from different administrators are ordered and conflicts in changes are resolved.

(2) Client principals. Clients in the illustrative pub/sub system have associated principals which are decided/verified by the system through authentication when clients connect. It is to be understood that the term “client” as illustratively used herein may refer to either a publisher (i.e., a publishing client) or a subscriber (i.e., a subscribing client). In fact, a client can act as both a publisher and a subscriber. That is, a client can connect to the system, publish messages or subscribe and receive messages. The client's capability to connect, publish and subscribe/receive messages is regulated by the access rights of its principal. For example, if a client is interested in receiving stock quotes, financial news and reports of IBM Corporation but its principal has only access rights to stock quotes, the client will not receive any news and reports even though it requests them.

The access control rules in the illustrative system are associated with principals. Multiple clients of the same principal can connect at different places in the system.

There are two types of principals in the illustrative system: (1) group; and (2) individual. A group principal is a collection of individuals or, recursively, other group principals. Access rights granted to a group principal are automatically granted to all members of the group, and recursively to the members of a member group.

The access rights of a principal include the right to connect, the right to publish and the right to subscribe to and receive messages. We adopt a content-based form for specifying access control rules of these three rights. An access control rule takes the following form of a tuple of three elements:

    [Principal, Access type, Content filter]

A rule of such form specifies that a principal has the right to connect to the system, publish or subscribe to messages matching a content filter. While publish and subscribe rules can take a non-trivial filter, connect rules are specified with true or false to indicate the right to connect or not. For example, the rules that allow a principal “John Doe” to connect and subscribe to stock quotes are specified as follows:

    [John Doe, Connect, True]
    [John Doe, Subscribe, type=‘quote’]

The access control rules are maintained internally in positive forms in that all rules specify what a principal is allowed to do. Negative forms specifying what a principal is not allowed to do are provided as a convenience to security administrators and are converted internally to positive forms by taking the negation of the content filters.
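The rule tuple and the conversion from a negative form to a positive form may be sketched as follows. The names Rule and from_negative are hypothetical, and representing a content filter as a callable over a dictionary-valued message is an assumption of this sketch rather than the format used by the system.

```python
from typing import Callable, Dict, NamedTuple

Message = Dict[str, object]
Filter = Callable[[Message], bool]


class Rule(NamedTuple):
    """[Principal, Access type, Content filter], always held in positive form."""
    principal: str
    access_type: str        # "Connect", "Publish" or "Subscribe"
    content_filter: Filter  # what the principal IS allowed to do


def from_negative(principal: str, access_type: str, denied: Filter) -> Rule:
    """Convert a negative form ("not allowed to ...") into a positive rule
    by taking the negation of the content filter."""
    return Rule(principal, access_type, lambda msg: not denied(msg))
```

For instance, a negative form denying “John Doe” access to reports would be stored in this sketch as a positive Subscribe rule whose filter passes every message that is not a report.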

Under the regulation of access control rules, a client is allowed to publish messages that match the publish rules of its principal and is allowed to receive messages that match both its subscription filters and the subscribing rules of its principal. This allows the system to provide: (1) information authenticity by allowing only authorized sources to publish messages; (2) information confidentiality by only distributing messages to authorized subscribers; and (3) protection against denial-of-service (DoS) attacks initiated by malicious subscribers who request a large number of messages that are only going to be discarded. Such a large number of messages can result in congestion in the network and impair the system's capability to serve other clients.
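The enforcement described above may be sketched as follows, assuming that a principal's rules of one access type are combined by logical OR (consistent with rights being a union). The function names restrict and may_publish and the callable filter representation are hypothetical.

```python
from typing import Callable, Dict, List

Message = Dict[str, object]
Filter = Callable[[Message], bool]


def restrict(subscription: Filter, subscribe_rules: List[Filter]) -> Filter:
    """Return a restricted subscription: a message must match the client's own
    subscription filter AND at least one subscribing rule of its principal."""
    def restricted(msg: Message) -> bool:
        return subscription(msg) and any(rule(msg) for rule in subscribe_rules)
    return restricted


def may_publish(msg: Message, publish_rules: List[Filter]) -> bool:
    """A client may publish a message only if a publish rule of its principal matches it."""
    return any(rule(msg) for rule in publish_rules)
```

Applied to the earlier example, a client requesting quotes, news and reports whose principal holds only a quote rule ends up with a restricted subscription that passes quotes alone, so unwanted messages need not be routed toward that client.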

Group and individual principals share the same form of connect, publish and subscribe access rights. In addition, a new type of rule, referred to as member list, exists for group principals. For example, a premium subscribers group that includes Jane Smith and James Brown and has subscribing rights to all stock quotes, news and reports has the following access control rules:

-   -   [Premium group, Member list, {Jane Smith, James Brown}]
    -   [Premium group, Subscribe, type=‘quote’ or type=‘news’ or type=‘report’]

All members in a group are automatically granted the access rights of the group. Thus, the access rights of an individual principal are the union of the individual's rights and the rights of all group principals to which it belongs. Hence, Jane Smith and James Brown will have access to all stock quotes, news and reports in addition to other access rights they are granted.
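As a rough sketch of the union semantics, the effective subscribing filter of a principal can be computed as the disjunction of its own filters and the filters of every group whose member list contains it. The function name `effective_subscribe_filter` and the dictionary layout are assumptions made for illustration.

```python
from typing import Callable, Dict, List, Set

Message = Dict[str, str]
Filter = Callable[[Message], bool]


def effective_subscribe_filter(
    principal: str,
    own_rules: Dict[str, List[Filter]],
    member_lists: Dict[str, Set[str]],
    group_rules: Dict[str, List[Filter]],
) -> Filter:
    """Union of the principal's own rights and the rights of all its groups."""
    filters: List[Filter] = list(own_rules.get(principal, []))
    for group, members in member_lists.items():
        if principal in members:
            filters.extend(group_rules.get(group, []))
    # A union of positive rules is a logical OR over their content filters.
    return lambda m: any(f(m) for f in filters)


# The premium group example from the text.
MEMBER_LISTS = {"Premium group": {"Jane Smith", "James Brown"}}
GROUP_RULES = {
    "Premium group": [lambda m: m.get("type") in ("quote", "news", "report")]
}
OWN_RULES: Dict[str, List[Filter]] = {}  # no individual rules in this example

jane = effective_subscribe_filter("Jane Smith", OWN_RULES, MEMBER_LISTS, GROUP_RULES)
john = effective_subscribe_filter("John Doe", OWN_RULES, MEMBER_LISTS, GROUP_RULES)
```

Here `jane` accepts news items because of her group membership, whereas `john`, who is not on the member list and has no rules of his own in this sketch, accepts nothing.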

FIG. 3 illustrates an example of the illustrative service model.

The illustrative deterministic service model provides clear starting points for access control changes. In this model, access control rules/changes 301 are initiated by a security administrator 302 at an administrative console and stored into a persistent storage 303 called ACL DB (access control database). At any time, the security administrator may specify a number of changes pertaining to one or more principals. These changes are considered as a batch that must be enforced atomically. After the security administrator confirms each batch of changes, the changes are propagated throughout broker network 304.

The brokers, to which publishers (not shown) connect, host one or more message streams. Each stream contains, in order, the messages published by one or more publishers. For example, as shown in FIG. 3, broker 305 hosts message stream 306.

For each of these streams, the broker picks a starting point to enact the new access control rules specified by the security administrator. The starting point is chosen such that: (1) successive batches of changes get later starting points; and (2) the starting point is late enough that no message after it could have been delivered according to the old rules. This can be achieved by designating a newly published message on the stream as the starting point. The starting point information is sent back to the security administrator for future inquiries and references. The new rules are enforced uniformly throughout the system on all messages after the starting points, no matter where a principal's client connects.

Also shown in FIG. 3 is subscriber 307 receiving one or more messages 308 to which the subscriber subscribed, an example of which will be further explained below.

We illustrate the effect of an access control policy change using an example in which a principal John Doe's subscribing rights went through three phases of changes: (1) John Doe became a member of the promotional group, which had subscribing access only to stock quotes; (2) John Doe became a premium subscriber and subsequently gained subscribing access to all three types of financial information (stock quotes, financial news and reports); and (3) John Doe's premium subscription expired and, as a result, he lost subscribing rights to financial news and reports.

In relation to FIG. 3, it is assumed that a subscriber with principal John Doe connected to the system and requested a subscription of issue=‘ibm’. Under the service model, every time the access rights of John Doe change, the system provides a clear starting point in each message stream such that: (1) a message before the starting point is delivered to the client if and only if the message satisfies both the subscription filter and the access right filter before the change; and (2) a message after the starting point is delivered to the client if and only if it satisfies both the subscription filter and the access right filter after the change. In the stream in the example of FIG. 3, if the starting points chosen are message 100 for the first access change, message 103 for the second access change and message 106 for the third access change, the messages delivered to the client will be 100, 103, 104, 107, 109. Notice that non-quotes are only delivered in the range 103 . . . 105. In a system that has more than one message stream, this happens on all streams, each with its individual starting points.
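The starting-point semantics can be worked through in a short sketch. FIG. 3 is not reproduced here, so the stream contents below (issues and message types for messages 99 through 109) are hypothetical values chosen to be consistent with the delivered sequence stated above; it is likewise an assumption that John Doe has no subscribing rights before message 100.

```python
from bisect import bisect_right
from typing import Dict, List, Set, Tuple

QUOTES_ONLY: Set[str] = {"quote"}
ALL_TYPES: Set[str] = {"quote", "news", "report"}

# (starting message id, message types the principal may receive from there on)
PHASES: List[Tuple[int, Set[str]]] = [
    (100, QUOTES_ONLY),   # joined the promotional group
    (103, ALL_TYPES),     # became a premium subscriber
    (106, QUOTES_ONLY),   # premium subscription expired
]


def allowed_types(msg_id: int) -> Set[str]:
    """Access rights in effect for a message, selected by its starting point."""
    starts = [start for start, _ in PHASES]
    i = bisect_right(starts, msg_id) - 1
    return PHASES[i][1] if i >= 0 else set()


# Hypothetical stream: message id -> (issue, type).
STREAM: Dict[int, Tuple[str, str]] = {
    99: ("ibm", "quote"),
    100: ("ibm", "quote"),
    101: ("ibm", "news"),
    102: ("msft", "quote"),
    103: ("ibm", "news"),
    104: ("ibm", "report"),
    105: ("msft", "news"),
    106: ("ibm", "news"),
    107: ("ibm", "quote"),
    108: ("ibm", "report"),
    109: ("ibm", "quote"),
}


def delivered(stream: Dict[int, Tuple[str, str]], issue: str) -> List[int]:
    """Ids delivered to a client subscribed to one issue."""
    return [
        msg_id
        for msg_id, (msg_issue, msg_type) in sorted(stream.items())
        if msg_issue == issue and msg_type in allowed_types(msg_id)
    ]


print(delivered(STREAM, "ibm"))  # [100, 103, 104, 107, 109]
```

Because the rights are looked up by message position rather than by wall-clock time, every client of the same principal would compute the same result regardless of where it connects.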

The routing topology employed by the network of brokers is an abstract topology model of spanning trees of nodes where each node includes multiple virtual brokers that are redundant and can work interchangeably. Trees are noncyclic structures that simplify the task of loop-free routing. Tree nodes with redundant brokers provide high availability.

Recall from the illustrative pub/sub system described above in the context of FIG. 1 (in which the illustrative service model of the invention can be implemented) that we refer to a broker where publishers connect as a publisher hosting broker (PHB) and a broker where subscribers connect as a subscriber hosting broker (SHB). For simplicity, we will discuss routing from the standpoint of one PHB. The abstract network may be constructed such that any physical broker hosting clients implements a virtual broker in a leaf node. Hence, in this model, the SHBs only reside in the leaf nodes of the tree; and there is only one PHB and it resides in the root of the tree. The direction upstream/downstream points toward/away from the root. Because a client connects to one broker, each leaf node contains one broker. This topology model can represent a large range of practical topologies as one can transform a graph with redundant paths into a topology under this model by grouping brokers into tree nodes and inter-broker links into tree edges.

One illustrative implementation of access control is one in which the PHB and intermediate brokers forward all published messages that match client subscriptions to SHBs, and SHBs enforce access control by delivering only messages that match both a client's subscription and its access rights. Such a solution is a correct implementation, but it may waste considerable bandwidth sending messages that will later be discarded.

Subscription propagation is an optimization which may result in fewer wasted messages being sent to SHBs in exchange for requiring the PHB and intermediate brokers to acquire knowledge about subscription predicates and perform filtering. By propagating clients' access rights along with their subscriptions, further savings in communication cost may be achieved.

Providing the deterministic service guarantee described above is challenging in a content-based system deployed over a network with redundant paths. Due to content-based routing, gaps cannot be detected by traditional methods such as publisher-assigned sequence numbers because each subscriber may request a completely unique sequence of messages to be delivered. Reliability in a content-based system hence requires brokers on the routing path to assist in gap detection.

Multiple paths, communication asynchrony and failures complicate the propagation of subscription and access control information as redundant brokers on alternative routes may have different subscription and access control information from each other. If messages from the same published stream are routed through those brokers, they are matched to different sets of subscription and access control filters. As a result, gaps can appear in the message sequences delivered to subscribers.

Illustrative protocols for subscription propagation are disclosed in Y. Zhao et al., “A General Algorithmic Model for Subscription Propagation and Content-based Routing with Delivery Guarantees,” RC23669, IBM Research 2005, and Y. Zhao et al., “Subscription Propagation in Highly-available Publish/Subscribe Middleware,” ACM/IFIP/USENIX 5th International Middleware Conference, 2005, the disclosures of which are incorporated by reference herein. These protocols preserve reliable delivery and enable free routing choices on any of the redundant paths for system availability and load sharing. Furthermore, such protocols provide that a subscription's reliable delivery starting point on a published message stream can be chosen as any point in the stream provided that none of the messages after the starting point has been acknowledged so that the system may have reclaimed the persistent storage occupied by the message.

Below, we use such reliable delivery and subscription propagation protocols as building blocks for constructing an efficient and highly available distributed protocol that enforces the deterministic semantics of dynamic access control to pub/sub clients. We adopt a domain-based trust model. All brokers within the same domain trust each other. Brokers that do not trust each other should be put into different domains, and cross-domain communication is regulated by assigning access control rules according to their trust levels. For simplicity, we discuss the protocols under one trusted domain. This is of practical use since, in many commercial cases, pub/sub systems are deployed in a managed environment under the complete control of an administrator. The concepts can be extended to multiple trusted domains by treating a domain as a publishing/subscribing client and assigning a principal to the domain. The clients connected to the system through an un-trusted domain can only access messages that satisfy both the domain's access rights and their own access rights.

B. Access Control Policy Distribution and Message Delivery Protocol

FIG. 4 illustrates an access control policy distribution and message delivery protocol that provides a deterministic service guarantee of message delivery. As outlined, the protocol provides for: distributing access control information to brokers that host relevant principals (401); restricting publishing activities by accepting only messages satisfying the publisher's publishing rights (402); restricting client subscriptions using their subscribing rights (403); propagating restricted subscriptions and hence enforcing access control in the routing brokers by performing content filtering on both the clients' subscriptions and access rights (404); and final enforcement of access control at the SHB (405). We describe each of these protocol aspects below with reference back to components illustrated in FIGS. 1A through 3.

As previously mentioned, access control policies are maintained in a persistent storage called ACL DB. The security administrator makes policy changes in transactional batches to the ACL DB. Access control policies are associated with a control version, which is an integer counter. Each transactional batch brings the ACL DB into a new control version. The new access control rules are assigned the new version number. As old access control rules may still be in effect for some messages, the ACL DB contains a mixture of access control rules with different versions. To avoid sending the whole state, the ACL DB distributes the new version of access control by publishing it as an incremental change.
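A minimal sketch of this versioning scheme is given below. The class `AclDb` and its methods are hypothetical names; filters are kept as opaque strings, and in-memory dictionaries stand in for the persistent storage, both of which are simplifying assumptions.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple

RuleKey = Tuple[str, str]  # (principal, access type)


@dataclass
class AclDb:
    """Versioned rule store: each committed batch yields one new control version."""
    version: int = 0
    history: Dict[RuleKey, List[Tuple[int, str]]] = field(default_factory=dict)

    def commit(self, batch: Dict[RuleKey, str]) -> Tuple[int, Dict[RuleKey, str]]:
        """Apply a batch atomically and return the incremental change to publish."""
        self.version += 1
        for key, content_filter in batch.items():
            # Older versions are retained because they may still be in
            # effect for messages published before the new starting point.
            self.history.setdefault(key, []).append((self.version, content_filter))
        return self.version, dict(batch)

    def rule_at(self, key: RuleKey, version: int) -> Optional[str]:
        """Filter in effect for `key` at the given control version, if any."""
        current = None
        for rule_version, content_filter in self.history.get(key, []):
            if rule_version <= version:
                current = content_filter
        return current


db = AclDb()
v1, delta1 = db.commit({("John Doe", "subscribe"): "type='quote'"})
v2, delta2 = db.commit({("John Doe", "subscribe"): "type='quote' or type='news'"})
```

Only `delta1` and `delta2` would be published to brokers; the full rule history stays in the store.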

Each PHB/SHB maintains a cache of the latest access control rules for clients that are currently connected. When a client with a new principal connects, the broker retrieves an initial version of access control rules for the principal through a request/reply protocol with the ACL DB. The broker also establishes a subscription for receiving future access control changes for the connected principal. The subscription propagation and reliable delivery service ensures that the broker receives every access control change after obtaining an initial version of access rules for a connected principal.

When a PHB receives a new version of access control rules, it updates its cache. The PHB picks a starting point for the new version as the next message that will be published. Newly published messages will only be accepted if they match the current publishing rights in effect. In addition, newly published messages are transmitted in the system carrying the access control version that is in effect. We now describe how subscribing rights are enforced.

In a system where content-based routing is purely based on client subscriptions and access control is only enforced at the broker where clients connect, the routing brokers may send messages that match the clients' subscriptions but not their access rights. These messages will only be discarded later and waste system bandwidth.

We treat access control information as another type of information that can affect message routing in addition to client subscriptions. Thus, instead of propagating the original client subscription filters to the rest of the network, the SHBs propagate a restricted form of filters that are the intersection of the client subscription filter and the latest version of content-based access rules in the SHB's cache. When the access control rules change, the SHB re-computes the restricted subscriptions for all affected clients/principals with the new version of rules. The resulting subscriptions are propagated upstream atomically together with the control version. The upstream routing brokers handle the subscriptions without having to know whether the subscriptions are restricted. The upstream routing broker only needs to maintain a vector of control versions for each SHB in its downstream.

As we propagate restricted filters, content-based routing is based on the intersection of client subscription filters and access control rules. This allows the routing brokers to participate in access control as well as the SHBs.

As mentioned above, a message in the system carries the control version that is in effect for the message. When routing the message for a downstream, a routing broker compares the version of the message with the sub-portion of its control version vector for SHBs located in that downstream. The message is only filtered out if it does not match the restricted subscriptions from the downstream and every element of the sub-portion of the broker's control version vector is no less than that of the message. In the case that the broker does not have a sufficiently large control version vector, the broker may conservatively send the message to the downstream. For example, a broker b sends the message to a downstream broker anyway, even though the message may be wasteful, i.e., does not match the subscription filter and the subscribing rights of any client connected at the sub-network rooted at the downstream broker. If the broker b had a sufficiently large control version vector, the broker b may be able to filter out and withhold from sending a message to a downstream broker b’ if the message does not match any subscription filter it (broker b) maintains for a sub-network rooted at the downstream broker b’.

The ultimate enforcers of access control are the SHBs, as intermediate routing brokers may conservatively send messages that do not match a client's access rights.

The SHB first examines whether it has received the access control rules of the version required by the message. If not, the SHB delays the processing of the message until that version of the access control rules arrives. Once the SHB has the rules of that control version, the SHB examines each restricted subscription that matches the message. If the restricted subscription has the same control version as the message, the message is delivered to the subscribing client. Otherwise, the message is not delivered to the client.

The use of control versions not only allows the message delivery algorithm to implement the clear starting point feature of the service model, but also allows the system to be more asynchronous and fault tolerant. The distribution of access control changes with a control version number allows each broker in the system to proceed asynchronously instead of waiting for a slow or crashed broker, as would be required if a transactional session of broadcasting to all brokers were utilized. Even in the case that a majority of brokers fail in a routing tree node, new access control rules can be enacted and the remaining broker can participate in enforcing access control without having to obtain an agreement from its redundant peers. When a broker recovers, even when its control version may lag behind, the broker can still participate in message routing, utilizing its part of the network capacity that would otherwise stay idle.

The use of a control rule cache of only connected principals allows the system to scale even in environments where the number of principals is large. The SHBs only need to know access control rules for the principals that are locally connected.

Referring now to FIGS. 5A and 5B, a portion of a content-based publish/subscribe system implementing an access control policy distribution and message delivery protocol, according to one embodiment of the invention, is shown. It is to be understood that, in the illustrated example, only one security administrator, one PHB, one IB, and one SHB are shown for simplicity purposes. Actual systems may involve multiple such components. Further, a PHB may be coupled to an SHB directly, i.e., without having an IB therebetween. Also, it is to be understood that the security administrator, the PHB, the SHB and the IB refer to computing devices that perform the steps described. Such computing devices may be configured as illustrated in FIG. 1B, such that the steps/operations described can be executed via the processor and memory arrangement.

As shown, in this example, security administrator 502 sends a message including an access control policy change (i.e., batch of access control policy changes) with the assigned control version number, as explained above, to appropriate brokers. Appropriate brokers would be any brokers that host a client on behalf of a principal affected by the policy change. In this example, appropriate brokers are shown as PHB (504) and SHB1 (508).

As explained above, and as illustrated in FIG. 5A, when the access control rules change, the SHB re-computes the restricted subscriptions for all affected clients/principals with the new version of rules. The resulting subscriptions are propagated upstream atomically together with the control version. The upstream routing brokers (IB 506 in this example) handle the subscriptions without having to know whether the subscriptions are restricted. The upstream routing broker only needs to maintain a vector of control versions for each SHB in its downstream.

Further, as explained above, and as illustrated in FIG. 5B, when a PHB receives a new version of access control rules, it updates its cache. The PHB picks a starting point for the new version as the next message that will be published. The newly published messages are transmitted by the PHB (504) in the system carrying the access control version that is in effect.

Referring now to FIGS. 6A through 6E, flow charts illustrate steps taken, in the access control policy distribution and message delivery protocol, by an SHB (FIGS. 6A and 6B), a PHB (FIGS. 6C and 6D), and an IB (FIG. 6E). These flow charts illustrate examples of aspects of the access control policy distribution and message delivery protocol explained above.

FIG. 6A illustrates processing performed by the SHB upon receiving information on a new version of access control.

In step 602, the SHB receives access control rules and version numbers.

In step 604, the SHB updates its access control policy cache.

In step 606, the SHB computes the intersection of the client subscription filter and the access control policy filter.

In step 608, the SHB propagates to the upstream broker(s) the intersected subscriptions and the version number of the control rules used.

In step 610, the SHB checks whether there are any buffered messages with the same control version number as this latest ACL change.

If “no” in step 610, the SHB does nothing more with respect thereto(block 612).

If “yes” in step 610, the SHB checks, for each of these messages, whether it matches some intersected subscriptions (step 614).

If “no” in step 614, the SHB discards the message (step 616).

If “yes” in step 614, the SHB checks, for such an intersected subscription, whether its control version number is the same as that of the message (step 618).

If “no” in step 618, the SHB discards the message (step 616).

If “yes” in step 618, the SHB sends the message to the subscriber of the intersected subscription (step 620).
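The flow of FIG. 6A, together with the buffering of messages whose control version has not yet arrived, might be sketched as follows. The class `Shb`, its fields, and the dictionary-based message format are hypothetical. As a simplifying assumption, every local restricted subscription is re-stamped with the new control version on each change, and the upstream link is modeled as a plain list.

```python
from typing import Callable, Dict, List, Tuple

Message = Dict[str, object]
Filter = Callable[[Message], bool]


def both(a: Filter, b: Filter) -> Filter:
    """Intersection of two content filters."""
    return lambda m: a(m) and b(m)


def deny_all(_: Message) -> bool:
    return False


class Shb:
    def __init__(self) -> None:
        self.version = 0
        self.acl_cache: Dict[str, Filter] = {}               # principal -> access filter
        self.subscriptions: Dict[str, Tuple[str, Filter]] = {}
        self.restricted: Dict[str, Tuple[int, Filter]] = {}  # client -> (version, filter)
        self.buffered: List[Message] = []
        self.upstream: List[Tuple[str, int]] = []            # stand-in for propagation
        self.delivered: List[Tuple[str, object]] = []

    def _restrict(self, client: str) -> None:
        principal, subscription = self.subscriptions[client]
        access = self.acl_cache.get(principal, deny_all)
        self.restricted[client] = (self.version, both(subscription, access))
        self.upstream.append((client, self.version))         # steps 608 / 628

    def subscribe(self, client: str, principal: str, subscription: Filter) -> None:
        self.subscriptions[client] = (principal, subscription)
        self._restrict(client)

    def on_acl_change(self, version: int, rules: Dict[str, Filter]) -> None:
        self.acl_cache.update(rules)                          # step 604
        self.version = version
        for client in self.subscriptions:                     # step 606
            self._restrict(client)
        ready = [m for m in self.buffered if m["version"] == version]   # step 610
        self.buffered = [m for m in self.buffered if m["version"] != version]
        for message in ready:
            self._deliver(message)

    def on_message(self, message: Message) -> None:
        if message["version"] > self.version:
            self.buffered.append(message)   # rules of that version not yet received
        else:
            self._deliver(message)

    def _deliver(self, message: Message) -> None:
        for client, (version, matches) in self.restricted.items():
            # Steps 614 and 618: content match and equal control version.
            if matches(message) and version == message["version"]:
                self.delivered.append((client, message["id"]))
```

In this sketch a message stamped with version 1 that arrives before the version 1 rules is held back, and it is delivered only after `on_acl_change(1, ...)` has produced a matching restricted subscription of the same version.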

FIG. 6B illustrates processing performed by the SHB upon receiving a new client subscription.

In step 622, the SHB receives the client subscription filter.

In step 624, the SHB retrieves the latest subscribing access rules for the principal of the client from the access control policy cache.

In step 626, the SHB computes the intersection of the client subscription filter and the access control policy filter.

In step 628, the SHB propagates to the upstream broker(s) the intersected subscriptions and the version number of the control rules used.

FIG. 6C illustrates processing performed by the PHB upon receiving information on a new version of access control.

In step 630, the PHB receives the access control rules and the version numbers.

In step 632, the PHB updates its access control policy cache.

FIG. 6D illustrates processing performed by the PHB upon receiving a new data message.

In step 634, the PHB receives a data message.

In step 636, the PHB checks whether the message matches the latest publishing rights of the principal of the publisher.

If “no” in step 636, the PHB discards the message (step 638).

If “yes” in step 636, the PHB sets the control version number on the message to the latest version number and sends the message (step 640).
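The PHB processing of FIGS. 6C and 6D can be sketched in a few lines. The class `Phb`, the `sent` list standing in for transmission into the broker network, and the dictionary message format are assumptions made for illustration.

```python
from typing import Callable, Dict, List, Optional

Message = Dict[str, object]
Filter = Callable[[Message], bool]


class Phb:
    def __init__(self) -> None:
        self.version = 0
        self.publish_rights: Dict[str, Filter] = {}   # principal -> publish filter
        self.sent: List[Message] = []                 # stand-in for the network

    def on_acl_change(self, version: int, rules: Dict[str, Filter]) -> None:
        """FIG. 6C: update the cache; the next published message is the starting point."""
        self.publish_rights.update(rules)             # step 632
        self.version = version

    def on_publish(self, principal: str, message: Message) -> Optional[Message]:
        """FIG. 6D: accept only messages within the publisher's rights."""
        right = self.publish_rights.get(principal)
        if right is None or not right(message):       # step 636
            return None                               # step 638: discard
        stamped = dict(message, version=self.version) # step 640: stamp and send
        self.sent.append(stamped)
        return stamped


phb = Phb()
phb.on_acl_change(1, {"Quote Feed": lambda m: m.get("type") == "quote"})
accepted = phb.on_publish("Quote Feed", {"id": 100, "type": "quote"})
rejected = phb.on_publish("Quote Feed", {"id": 101, "type": "news"})
```

The principal name "Quote Feed" is invented for the example. The accepted message leaves the PHB carrying control version 1, which is what downstream brokers compare against their own state.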

FIG. 6E illustrates processing performed by the IB upon receiving a data message.

In step 642, the IB receives a data message.

In step 644, the IB checks whether the message control version is less than or equal to every element of the IB's control version vector for the downstream.

If “yes” in step 644, the IB checks whether the message matches the subscriptions from the downstream (step 646).

If “no” in step 646, the IB discards the message (step 648).

If “no” in step 644 or “yes” in step 646, the IB sends the message to the downstream (step 650).
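The IB decision of FIG. 6E reduces to a small predicate, sketched below under the assumption that the IB keeps, per downstream, a mapping from SHB name to the last control version received from it and a list of the restricted subscription filters propagated from that downstream. The function name `ib_should_forward` is hypothetical.

```python
from typing import Callable, Dict, List

Message = Dict[str, object]
Filter = Callable[[Message], bool]


def ib_should_forward(
    message: Message,
    version_vector: Dict[str, int],
    downstream_filters: List[Filter],
) -> bool:
    """Return True if the IB sends the message to the downstream."""
    msg_version = message["version"]
    # Step 644: is the IB's state at least as new as the message for every SHB?
    up_to_date = all(v >= msg_version for v in version_vector.values())
    if not up_to_date:
        return True   # step 650: send conservatively, the SHB decides later
    # Step 646: with sufficiently new state, filtering is safe.
    return any(matches(message) for matches in downstream_filters)


quotes_for_ibm: Filter = lambda m: m.get("issue") == "ibm" and m.get("type") == "quote"
news_item: Message = {"id": 106, "version": 3, "issue": "ibm", "type": "news"}
```

With a version vector of `{"SHB1": 3}` the news item is filtered out, whereas with `{"SHB1": 2}` it is forwarded because the IB cannot yet tell whether the version 3 rules would admit it.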

It is to be understood that the above processing steps are intended to be illustrative in nature and, thus, an access control policy distribution and message delivery protocol of the invention may perform fewer or more processing steps, other processing steps, and/or the above processing steps in a different order.

Advantageously, as illustrated herein, a service model according to illustrative embodiments of the invention is able to substantially guarantee deterministic and uniform access control semantics to all subscribers on behalf of the same principals in the system. This is the case even when a crash and restart of one or more brokers causes the one or more brokers to lose the non-persistent state of the access control policy in effect. This is also the case even when at least one link in the network fails and its connection is re-established, causing at least one message transmitted over the link to be dropped, duplicated, or delivered out of order. That is, different subscribers on behalf of the same principal will receive exactly the same sequence of messages (modulo subscription filter differences), even when they are connected at different sub-networks in the system and even when their sub-networks may experience different communication latency and network or routing broker failures.

Further, such a service model allows for the ability to enforce both publishing and subscribing access control using a content-based form that can be applied to a content-based publish/subscribe system, as well as the ability for all brokers other than just the SHBs to participate in subscriber access control without having to maintain access control rules.

Although illustrative embodiments of the present invention have been described herein with reference to the accompanying drawings, it is to be understood that the invention is not limited to those precise embodiments, and that various other changes and modifications may be made by one skilled in the art without departing from the scope or spirit of the invention.

1. A method of providing access control in a content-based publish/subscribe system, wherein messages are delivered from publishing clients to subscribing clients via a plurality of brokers, the method comprising the steps of: specifying one or more changes to an access control policy; associating an access control version identifier to the one or more changes; sending the one or more changes to one or more brokers of the plurality of brokers that have a publishing client or a subscribing client associated therewith that is affected by the one or more changes; and sending the access control version identifier associated with the one or more changes to each of the plurality of brokers.
 2. The method of claim 1, wherein each of the one or more changes to the existing access control policy are stored and implemented in the system as a batch, having the access control version identifier associated therewith, so as to uniquely identify the one or more changes from one or more previous changes to the existing access control policy of the system.
 3. The method of claim 1, wherein each of the plurality of brokers is at least one of a publisher hosting broker (PHB), a subscriber hosting broker (SHB) and an intermediate broker (IB), and wherein the specifying, associating and sending steps are performed in accordance with a security administrator.
 4. The method of claim 3, wherein the security administrator sends the one or more changes and the associated access control version identifier to PHBs and SHBs that have a publishing client or a subscribing client associated therewith that is affected by the one or more changes.
 5. The method of claim 3, wherein an SHB, upon receipt of the one or more changes, computes a restricted subscription for an affected client.
 6. The method of claim 5, wherein the SHB sends the restricted subscription along with the access control number to one or more other brokers.
 7. The method of claim 3, wherein a PHB performs at least one of the steps of: upon receipt of the one or more changes, applying the one or more changes to the access control policy to obtain the latest publishing rights and the access control version identifier; and upon receipt of a data message to be published, applying the latest publishing rights to the message.
 8. The method of claim 7, wherein the PHB sends the data message along with the access control number to one or more other brokers.
 9. The method of claim 3, wherein an IB maintains a control version vector.
 10. Apparatus for providing access control in a content-based publish/subscribe system, wherein messages are delivered from publishing clients to subscribing clients via a plurality of brokers, comprising: a memory; and at least one processor coupled to the memory and operative to: (i) specify one or more changes to an access control policy; (ii) associate an access control version identifier to the one or more changes; (iii) send the one or more changes to one or more brokers of the plurality of brokers that have a publishing client or a subscribing client associated therewith that is affected by the one or more changes; and (iv) send the access control version identifier associated with the one or more changes to each of the plurality of brokers.
 11. A content-based publish/subscribe system for providing message delivery from a publishing client to a subscribing client, the system comprising: a plurality of brokers operatively coupled to one another via a network, each of the brokers being configured as at least one of a publisher hosting broker (PHB), a subscriber hosting broker (SHB) and an intermediate broker (IB); at least one administrator being operatively coupled to at least a portion of the plurality of brokers, and being configured to store and update at least one access control policy within the system; and wherein at least a portion of the plurality of brokers and the at least one administrator are configured to implement a change to the access control policy within the system by including an access control version identifier with one or more messages sent therebetween, wherein the access control identifier uniquely identifies the access control policy that is in effect, such that the change in the access control policy deterministically and uniformly applies to publishing clients and subscribing clients associated with one or more principals affected by the change in the access control policy.
 12. The system of claim 11, wherein the plurality of brokers are configured to eliminate a need for persistent storage of access control state at brokers other than the PHBs.
 13. The system of claim 11, wherein at least one PHB is configured to persistently store the control version identifier associated with the latest access control policy.
 14. The system of claim 13, wherein at least one PHB is configured to persistently store access control version identifiers associated with access control policies that were in effect at the time a message was published.
 15. The system of claim 11, wherein multiple paths exist between a PHB and SHB, and IBs on different paths need not maintain identical access control state.
 16. The system of claim 11, wherein at least a portion of the IBs maintain access control version vectors, with one version per SHB, rather than maintaining access control rules.
 17. The system of claim 11, wherein each SHB maintains the latest access control rules for principals that are connected thereto.
 18. The system of claim 11, wherein: (i) an SHB subscribes to access control rule changes for principals connected thereto; (ii) IBs filter access control rule changes by principal, (iii) reliable delivery is used to ensure that access control rule changes are received by the SHBs that need them, and (iv) an SHB that accepts a connection from a new principal uses a request-response protocol to initialize the access control rules for that principal.
 19. The system of claim 18, wherein: (i) an SHB intersects subscriptions with the latest access control rules and assigns and maintains the control versions of the intersected subscriptions using the version of the control rules; (ii) the SHB propagates the resulting subscription with the access control version identifier to upstream IBs; and (iii) IBs maintain subscription state with access control version identifiers.
 20. The system of claim 19, wherein: (i) PHBs include the access control version identifier in data messages, (ii) IBs use subscription state to filter the message if the access control version identifier in the data message is no more than the access control version number in the subscription state, and otherwise send the message downstream; (iii) an SHB checks equality of the control version identifier of the intersected subscriptions that match a message with the control version of the message to enforce subscribing access control rules.